44 research outputs found

Guiding Programmers to Higher Memory Performance

Author: Jensen Nicklas Bo
Karlsson Sven
Ladelsky Razya
Larsen Per
Zaks Ayal
Publication date: 01/01/2012

Online Research Database In Technology

Automatic Loop Parallelization via Compiler Guided Refactoring

Author: Karlsson Sven
Ladelsky Razya
Larsen Per
Lidman Jacob
McKee Sally A.
Zaks Ayal
Publication venue: Technical University of Denmark
Publication date: 01/01/2011

Online Research Database In Technology

Vapor SIMD: Auto-Vectorize Once, Run Everywhere

Author: Cohen Albert
Dyshel Sergei
Nuzman Dorit
Rohou Erven
Rosen Ira
Williams Kevin
Yuste David
Zaks Ayal
Publication venue: HAL CCSD
Publication date: 01/01/2011

Just-in-Time (JIT) compiler technology offers portability while facilitating target- and context-specific specialization. Single-Instruction-Multiple-Data (SIMD) hardware is ubiquitous and markedly diverse, but can be difficult for JIT compilers to efficiently target due to resource and budget constraints. We present our design for a synergistic auto-vectorizing compilation scheme. The scheme is composed of an aggressive, generic offline stage coupled with a lightweight, target-specific online stage. Our method leverages the optimized intermediate results provided by the first stage across disparate SIMD architectures from different vendors, having distinct characteristics ranging from different vector sizes, memory alignment and access constraints, to special computational idioms. We demonstrate the effectiveness of our design using a set of kernels that exercise innermost loop, outer loop, as well as straight-line code vectorization, all automatically extracted by the common offline compilation stage. This results in performance comparable to that provided by specialized monolithic offline compilers. Our framework is implemented using open-source tools and standards, thereby promoting interoperability and extendibility.

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

Hal-Diderot

HAL-Rennes 1

Vapor SIMD: Auto-Vectorize Once, Run Everywhere

Author: Cohen Albert
Dyshel Sergei
Nuzman Dorit
Rohou Erven
Rosen Ira
Williams Kevin
Yuste David
Zaks Ayal
Publication venue: HAL CCSD
Publication date: 04/04/2011

INRIA a CCSD electronic archive server

MILEPOST GCC: machine learning based research compiler

Author: Ashton Elton
Barnard Phil
Bodin Francois
Bonilla Edwin
Courtois Eric
Fursin Grigori
Leather Hugh
Mendelson Bilha
Miranda Cupertino
Namolaru Mircea
O'Boyle Michael
Temam Olivier
Thomson John
Williams Christopher K. I.
Yom-Tov Elad
Zaks Ayal
Publication date: 01/01/2008

Tuning hardwired compiler optimizations for rapidly evolving hardware makes porting an optimizing compiler for each new platform extremely challenging. Our radical approach is to develop a modular, extensible, self-optimizing compiler that automatically learns the best optimization heuristics based on the behavior of the platform. In this paper we describe MILEPOST GCC, a machine-learning-based compiler that automatically adjusts its optimization heuristics to improve the execution time, code size, or compilation time of specific programs on different architectures. Our preliminary experimental results show that it is possible to considerably reduce execution time of the MiBench benchmark suite on a range of platforms entirely automatically.

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

Edinburgh Research Explorer

University of St. Andrews - Pure

HAL-Rennes 1

ACOTES project: Advanced compiler technologies for embedded streaming

Author: Albert Cohen
Alex Ramírez
Andrea Ornstein
Antoniu Pop
Ayal Zaks
Cupertino Miranda
Cédric Bastoul
David Ródenas
Dorit Nuzman
E. Blossom
E.A. Lee
Eduard Ayguadé
Erven Rohou
Harm Munk
Ira Rosen
J. Hoogerbrugge
Konrad Trifunović
Louis-Noël Pouchet
M. Gschwind
M. Wolfe
Marc Duranton
Marco Cornero
Menno Lindwer
Mohammed Fellahi
Paul Carpenter
Philippe Dumont
R. Allen
R.G. Scarborough
Razya Ladelsky
Roger Ferrer
S. Campanoni
Sebastian Pop
Uzi Shvadron
Xavier Martorell
Zbigniew Chamski
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2011

Streaming applications are built of data-driven, computational components, consuming and producing unbounded data streams. Streaming oriented systems have become dominant in a wide range of domains, including embedded applications and DSPs. However, programming efficiently for streaming architectures is a challenging task, having to carefully partition the computation and map it to processes in a way that best matches the underlying streaming architecture, taking into account the distributed resources (memory, processing, real-time requirements) and communication overheads (processing and delay). These challenges have led to a number of suggested solutions, whose goal is to improve the programmer’s productivity in developing applications that process massive streams of data on programmable, parallel embedded architectures. StreamIt is one such example. Another more recent approach is that developed by the ACOTES project (Advanced Compiler Technologies for Embedded Streaming). The ACOTES approach for streaming applications consists of compiler-assisted mapping of streaming tasks to highly parallel systems in order to maximize cost-effectiveness, both in terms of energy and in terms of design effort. The analysis and transformation techniques automate large parts of the partitioning and mapping process, based on the properties of the application domain, on the quantitative information about the target systems, and on programmer directives. This paper presents the outcomes of the ACOTES project, a 3-year collaborative work of industrial (NXP, ST, IBM, Silicon Hive, NOKIA) and academic (UPC, INRIA, MINES ParisTech) partners, and advocates the use of Advanced Compiler Technologies that we developed to support Embedded Streaming.

HAL-CentraleSupelec

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Crossref

UPCommons. Portal del coneixement obert de la UPC

INRIA a CCSD electronic archive server

HAL-MINES ParisTech

The University of Manchester - Institutional Repository

HAL-Rennes 1

Milepost GCC: Machine Learning Enabled Self-tuning Compiler

Author: Abdul Wahid Memon
Ayal Zaks
Bilha Mendelson
Christopher K. I. Williams
Edwin Bonilla
Elad Yom-Tov
Elton Ashton
Eric Courtois
Francois Bodin
G. Fursin
Grigori Fursin
J. Ullman
John Thomson
Michael O’Boyle
Mircea Namolaru
Olivier Temam
P. Larrañaga
Phil Barnard
R. Duda
R. El-Yaniv
R. Vuduc
Yuriy Kashnikov
Zbigniew Chamski
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2011

Tuning compiler optimizations for rapidly evolving hardware makes porting and extending an optimizing compiler for each new platform extremely challenging. Iterative optimization is a popular approach to adapting programs to a new architecture automatically using feedback-directed compilation. However, the large number of evaluations required for each program has prevented iterative compilation from widespread take-up in production compilers. Machine learning has been proposed to tune optimizations across programs systematically but is currently limited to a few transformations, long training phases and critically lacks publicly released, stable tools. Our approach is to develop a modular, extensible, self-tuning optimization infrastructure to automatically learn the best optimizations across multiple programs and architectures based on the correlation between program features, run-time behavior and optimizations. In this paper we describe Milepost GCC, the first publicly-available open-source machine learning-based compiler. It consists of an Interactive Compilation Interface (ICI) and plugins to extract program features and exchange optimization data with the cTuning.org open public repository. It automatically adapts the internal optimization heuristic at function-level granularity to improve execution time, code size and compilation time of a new program on a given architecture. Part of the MILEPOST technology together with low-level ICI-inspired plugin framework is now included in the mainline GCC. We developed machine learning plugins based on probabilistic and transductive approaches to predict good combinations of optimizations. Our preliminary experimental results show that it is possible to automatically reduce the execution time of individual MiBench programs, some by more than a factor of 2, while also improving compilation time and code size.
On average we are able to reduce the execution time of the MiBench benchmark suite by 11% for the ARC reconfigurable processor. We also present a realistic multi-objective optimization scenario for Berkeley DB library using Milepost GCC and improve execution time by approximately 17%, while reducing compilatio

HAL-CentraleSupelec

HAL - Lille 3

Crossref

INRIA a CCSD electronic archive server

Edinburgh Research Explorer

Hal-Diderot

University of St. Andrews - Pure

HAL UVSQ

HAL-Rennes 1

Algorithmic aspects of acyclic edge colorings

Author: Ayal Zaks
Noga Alon

A proper coloring of the edges of a graph G is called acyclic if there is no 2-colored cycle in G. The acyclic edge chromatic number of G, denoted by a′(G), is the least number of colors in an acyclic edge coloring of G. For certain graphs G, a′(G) ≥ ∆(G) + 2, where ∆(G) is the maximum degree in G. It is known that a′(G) ≤ ∆ + 2 for almost all ∆-regular graphs, including all ∆-regular graphs whose girth is at least c∆ log ∆. We prove that determining the acyclic edge chromatic number of an arbitrary graph is an NP-complete problem. For graphs G with sufficiently large girth in terms of ∆(G), we present deterministic polynomial time algorithms that color the edges of G acyclically using at most ∆(G) + 2 colors.
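
The definition in this abstract can be made concrete with a small checker. The following is only a minimal sketch of the definition, not the paper's algorithm: it verifies whether a given edge coloring is proper and acyclic, and does not compute a′(G). The function names and the input format (a simple undirected graph as a list of vertex pairs, plus a dict from edge to color) are assumptions made for illustration.

```python
from itertools import combinations


def is_proper(edges, color):
    """Return True if no two edges sharing a vertex have the same color."""
    seen = set()  # (vertex, color) pairs already used
    for u, v in edges:
        c = color[(u, v)]
        if (u, c) in seen or (v, c) in seen:
            return False
        seen.add((u, c))
        seen.add((v, c))
    return True


def has_cycle(edges):
    """Detect a cycle in an undirected simple graph with union-find."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:  # endpoints already connected: this edge closes a cycle
            return True
        parent[ru] = rv
    return False


def is_acyclic_edge_coloring(edges, color):
    """Proper coloring in which every pair of color classes forms a forest."""
    if not is_proper(edges, color):
        return False
    colors = {color[e] for e in edges}
    for c1, c2 in combinations(colors, 2):
        two_colored = [e for e in edges if color[e] in (c1, c2)]
        if has_cycle(two_colored):
            return False
    return True


# A 4-cycle has maximum degree 2. Two alternating colors give a proper
# coloring, but the whole cycle is then 2-colored, so it is not acyclic.
c4 = [(0, 1), (1, 2), (2, 3), (3, 0)]
two_colors = {(0, 1): 0, (1, 2): 1, (2, 3): 0, (3, 0): 1}
three_colors = {(0, 1): 0, (1, 2): 1, (2, 3): 0, (3, 0): 2}
```

With these inputs, `is_acyclic_edge_coloring(c4, two_colors)` is false and `is_acyclic_edge_coloring(c4, three_colors)` is true. Checking a coloring this way takes polynomial time; it is finding the minimum number of colors that the abstract states is NP-complete.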

CiteSeerX

Neighborly families in $E^{d}$ consisting of either $d$-pyramids over $(d-1)$-cubes, or else of $d$-cubes

Author: Ayal Zaks
B Grünbaum
J Zaks
JD Simon
Joseph Zaks
Publication venue: Springer Science and Business Media LLC

Crossref